Kaggle Chronicles: 15 Years of Competitions, Community and Data Science Innovation
Bönisch, Kevin, Losaria, Leandro
Since 2010, Kaggle has been a platform where data scientists from around the world come together to compete, collaborate, and push the boundaries of data science. Over these 15 years, it has grown from a purely competition-focused site into a broader ecosystem with forums, notebooks, models, datasets, and more. With the release of the Kaggle Meta Code and Kaggle Meta Datasets, we now have a unique opportunity to explore these competitions, technologies, and real-world applications of machine learning and AI. In this study, we take a closer look at 15 years of data science on Kaggle through metadata, shared code, community discussions, and the competitions themselves. We explore Kaggle's growth and its impact on the data science community, uncover hidden technological trends, analyze competition winners, examine how Kagglers approach problems in general, and more. We do this by analyzing millions of kernels and discussion threads to perform both longitudinal trend analysis and standard exploratory data analysis. Our findings show that Kaggle is a steadily growing platform with increasingly diverse use cases, and that Kagglers are quick to adopt new trends and apply them to real-world challenges, while producing, on average, models with solid generalization capabilities. We also offer a snapshot of the platform as a whole, highlighting its history and technological evolution. Finally, this study is accompanied by a video (https://www.youtube.com/watch?v=YVOV9bIUNrM) and a Kaggle write-up (https://kaggle.com/competitions/meta-kaggle-hackathon/writeups/kaggle-chronicles-15-years-of-competitions-communi) for your convenience.
- Health & Medicine > Therapeutic Area (0.68)
- Banking & Finance (0.67)
- Education > Educational Setting > Online (0.46)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Linear Dimensionality Reduction for Word Embeddings in Tabular Data Classification
Ressel, Liam, Gardi, Hamza A. A.
The Engineers' Salary Prediction Challenge requires classifying salary categories into three classes based on tabular data. The job description is represented as a 300-dimensional word embedding incorporated into the tabular features, drastically increasing dimensionality. Additionally, the limited number of training samples makes classification challenging. Linear dimensionality reduction of word embeddings for tabular data classification remains underexplored. This paper studies Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). We show that PCA, with an appropriate subspace dimension, can outperform raw embeddings. LDA without regularization performs poorly due to covariance estimation errors, but applying shrinkage improves performance significantly, even with only two dimensions. We propose Partitioned-LDA, which splits embeddings into equal-sized blocks and performs LDA separately on each, thereby reducing the size of the covariance matrices. Partitioned-LDA outperforms regular LDA and, combined with shrinkage, achieves top-10 accuracy on the competition public leaderboard. This method effectively enhances word embedding performance in tabular data classification with limited training samples.
- Europe > Germany (0.05)
- North America > United States (0.04)
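The Partitioned-LDA idea described above can be sketched in a few lines, assuming scikit-learn's `LinearDiscriminantAnalysis` with its built-in automatic (Ledoit-Wolf) shrinkage. The function name and default block count here are illustrative, not taken from the paper:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def partitioned_lda_fit_transform(X, y, n_blocks=6):
    """Split the embedding columns into equal-sized blocks, fit a
    shrinkage-regularized LDA on each block separately (so each
    covariance matrix is only block_size x block_size), and
    concatenate the per-block discriminant projections."""
    blocks = np.array_split(np.arange(X.shape[1]), n_blocks)
    parts = []
    for idx in blocks:
        # the 'eigen' solver supports both shrinkage and transform()
        lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto")
        parts.append(lda.fit_transform(X[:, idx], y))
    # with 3 classes, each block contributes at most 2 dimensions
    return np.hstack(parts)

# Illustrative usage on synthetic 300-dimensional "embeddings"
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 300))
y = np.repeat([0, 1, 2], 30)
Z = partitioned_lda_fit_transform(X, y, n_blocks=6)
print(Z.shape)  # (90, 12): 6 blocks x (3 classes - 1) dims each
```

Splitting the 300-dimensional embedding into six 50-dimensional blocks means each LDA only has to estimate a 50x50 covariance matrix, which is far better conditioned under a small training sample than the full 300x300 one.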
Multi-Label Plant Species Prediction with Metadata-Enhanced Multi-Head Vision Transformers
Herasimchyk, Hanna, Labryga, Robin, Prusina, Tomislav
We present a multi-head vision transformer approach for multi-label plant species prediction in vegetation plot images, addressing the PlantCLEF 2025 challenge. The task involves training models on single-species plant images while testing on multi-species quadrat images, creating a drastic domain shift. Our methodology leverages a pre-trained DINOv2 Vision Transformer Base (ViT-B/14) backbone with multiple classification heads for species, genus, and family prediction, utilizing taxonomic hierarchies. Key contributions include multi-scale tiling to capture plants at different scales, dynamic threshold optimization based on mean prediction length, and ensemble strategies through bagging and Hydra model architectures. The approach incorporates various inference techniques including image cropping to remove non-plant artifacts, top-n filtering for prediction constraints, and logit thresholding strategies. Experiments were conducted on approximately 1.4 million training images covering 7,806 plant species. Results demonstrate strong performance, making our submission 3rd best on the private leaderboard. Our code is available at https://github.com/geranium12/plant-clef-2025/tree/v1.0.0.
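The multi-scale tiling step can be sketched as follows (a minimal numpy illustration; the grid sizes and function name are hypothetical, not taken from the released code):

```python
import numpy as np

def multiscale_tiles(image, grid_sizes=(1, 2, 3)):
    """Cut an (H, W, C) quadrat image into g x g tiles for several grid
    sizes g, so that small plants are captured at fine scales and large
    plants at coarse ones; per-tile predictions can then be aggregated
    into a single multi-label prediction for the plot."""
    h, w = image.shape[:2]
    tiles = []
    for g in grid_sizes:
        for i in range(g):
            for j in range(g):
                tiles.append(image[i * h // g:(i + 1) * h // g,
                                   j * w // g:(j + 1) * w // g])
    return tiles

tiles = multiscale_tiles(np.zeros((518, 518, 3)), grid_sizes=(1, 2, 3))
print(len(tiles))  # 1 + 4 + 9 = 14 tiles
```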
ReXrank: A Public Leaderboard for AI-Powered Radiology Report Generation
Zhang, Xiaoman, Zhou, Hong-Yu, Yang, Xiaoli, Banerjee, Oishi, Acosta, Julián N., Miller, Josh, Huang, Ouwen, Rajpurkar, Pranav
AI-driven models have demonstrated significant potential in automating radiology report generation for chest X-rays. However, there is no standardized benchmark for objectively evaluating their performance. To address this, we present ReXrank, https://rexrank.ai, a public leaderboard and challenge for assessing AI-powered radiology report generation. Our framework incorporates ReXGradient, the largest test dataset consisting of 10,000 studies, and three public datasets (MIMIC-CXR, IU-Xray, CheXpert Plus) for report generation assessment. ReXrank employs 8 evaluation metrics and separately assesses models capable of generating only findings sections and those providing both findings and impressions sections. By providing this standardized evaluation framework, ReXrank enables meaningful comparisons of model performance and offers crucial insights into their robustness across diverse clinical settings. Beyond its current focus on chest X-rays, ReXrank's framework sets the stage for comprehensive evaluation of automated reporting across the full spectrum of medical imaging.
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Nuclear Medicine (0.80)
Vygotsky Distance: Measure for Benchmark Task Similarity
Surkov, Maxim K., Yamshchikov, Ivan P.
Evaluation plays a significant role in modern natural language processing. Most modern NLP benchmarks consist of arbitrary sets of tasks that neither guarantee any generalization potential for the model once applied outside the test set nor try to minimize the resource consumption needed for model evaluation. This paper presents a theoretical instrument and a practical algorithm to calculate similarity between benchmark tasks; we call this similarity measure "Vygotsky distance". The core idea of this similarity measure is that it is based on the relative performance of the "students" on a given task, rather than on the properties of the task itself. If two tasks are close to each other in terms of Vygotsky distance, models tend to have similar relative performance on them. Thus, knowing the Vygotsky distance between tasks, one can significantly reduce the number of evaluation tasks while maintaining a high validation quality. Experiments on various benchmarks, including GLUE, SuperGLUE, CLUE, and RussianSuperGLUE, demonstrate that a vast majority of NLP benchmarks could be at least 40% smaller in terms of the tasks included. Most importantly, Vygotsky distance could also be used for the validation of new tasks, thus increasing the generalization potential of future NLP models.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (3 more...)
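One plausible way to instantiate a "relative performance"-based distance like this (a sketch under our own assumptions, not the paper's exact definition) is a normalized count of how often two models swap rank order between the two tasks:

```python
from itertools import combinations

def rank_disagreement(scores_a, scores_b):
    """Fraction of model pairs whose relative order differs between
    task A and task B (a normalized Kendall-tau-style distance over
    the 'students'' scores). 0 means identical rankings, 1 means a
    fully reversed ranking."""
    n = len(scores_a)
    swaps = sum(
        (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j]) < 0
        for i, j in combinations(range(n), 2)
    )
    return swaps / (n * (n - 1) / 2)

# Four models scored on two tasks
print(rank_disagreement([0.9, 0.7, 0.5, 0.3], [0.8, 0.6, 0.4, 0.2]))  # 0.0
print(rank_disagreement([0.9, 0.7, 0.5, 0.3], [0.2, 0.4, 0.6, 0.8]))  # 1.0
```

Under a measure like this, two tasks with near-zero distance are largely redundant for ranking models, which is exactly why benchmarks can shed tasks without losing validation quality.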
Enhancing Model Performance in Multilingual Information Retrieval with Comprehensive Data Engineering Techniques
Zhang, Qi, Yang, Zijian, Huang, Yilun, Chen, Ze, Cai, Zijian, Wang, Kangxu, Zheng, Jiewen, He, Jiarong, Gao, Jin
In this paper, we present our solution to the Multilingual Information Retrieval Across a Continuum of Languages (MIRACL) challenge of WSDM CUP 2023 (https://project-miracl.github.io/). Our solution focuses on enhancing the ranking stage, where we fine-tune pre-trained multilingual transformer-based models with the MIRACL dataset. Our model improvement is mainly achieved through diverse data engineering techniques, including the collection of additional relevant training data, data augmentation, and negative sampling. Our fine-tuned model effectively determines the semantic relevance between queries and documents, resulting in a significant improvement in the efficiency of the multilingual information retrieval process. Finally, our team is pleased to achieve remarkable results in this challenging competition, securing 2nd place in the Surprise-Languages track with a score of 0.835 and 3rd place in the Known-Languages track with an average nDCG@10 score of 0.716 across the 16 known languages on the final leaderboard.
- Asia > Singapore (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
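As an illustration of the negative-sampling step mentioned above (a hedged sketch; the function and parameter names are ours, not the team's code), negatives for reranker fine-tuning are commonly drawn from top-ranked but non-relevant first-stage retrieval results:

```python
import random

def sample_hard_negatives(ranked_doc_ids, positive_ids, k=4, top_n=50, seed=0):
    """Draw up to k 'hard' negatives for one query: documents that a
    first-stage retriever (e.g. BM25 or a dense model) ranked highly
    (within top_n) but that are not labeled relevant. These teach the
    reranker finer distinctions than random negatives would."""
    rng = random.Random(seed)
    candidates = [d for d in ranked_doc_ids[:top_n] if d not in positive_ids]
    return rng.sample(candidates, min(k, len(candidates)))

ranked = [f"doc{i}" for i in range(100)]
negatives = sample_hard_negatives(ranked, positive_ids={"doc0", "doc3"}, k=4)
print(negatives)  # four non-relevant documents from the top 50
```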
Some Practice for Improving the Search Results of E-commerce
Wu, Fanyou, Liu, Yang, Gazo, Rado, Bedrich, Benes, Qu, Xiaobo
In the Amazon KDD Cup 2022, we aim to apply natural language processing methods to improve the quality of search results that can significantly enhance user experience and engagement with search engines for e-commerce. We discuss our practical solution for this competition, ranking 6th in task one, 2nd in task two, and 2nd in task three. The competition distinguishes the following relevance labels:
- Substitute (S): the item is somewhat relevant: it fails to fulfill some aspects of the query, but the item can be used as a functional substitute;
- Complement (C): the item does not fulfill the query but could be used in combination with an exact item;
- Irrelevant (I): the item is irrelevant, or it fails to fulfill a central aspect of the query.
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.05)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.05)
- Asia > China > Beijing > Beijing (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
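For context, the ESCI relevance labels (Exact plus the three defined above) are typically mapped to graded gains when ranking quality is scored with nDCG. The gain values below follow the competition's setup as we recall it, so treat them as an assumption rather than a quoted specification:

```python
import math

# Assumed ESCI gains: Exact, Substitute, Complement, Irrelevant
ESCI_GAIN = {"E": 1.0, "S": 0.1, "C": 0.01, "I": 0.0}

def dcg(labels):
    """Discounted cumulative gain of a ranked list of ESCI labels."""
    return sum(ESCI_GAIN[l] / math.log2(i + 2) for i, l in enumerate(labels))

def ndcg(labels):
    """DCG normalized by the ideal (best-possible) ordering of the
    same labels; 1.0 means the ranking is already optimal."""
    ideal = dcg(sorted(labels, key=ESCI_GAIN.get, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0

print(ndcg(["E", "S", "I"]))  # 1.0: already in ideal order
print(ndcg(["S", "E", "I"]))  # < 1.0: the Exact item is ranked too low
```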
NeurIPS 2019 Disentanglement Challenge: Improved Disentanglement through Learned Aggregation of Convolutional Feature Maps
Seitzer, Maximilian, Foltyn, Andreas, Kemeth, Felix P.
This report on our stage 2 submission to the NeurIPS 2019 disentanglement challenge presents a simple image preprocessing method for learning disentangled latent factors. We propose to train a variational autoencoder on regionally aggregated feature maps obtained from networks pretrained on the ImageNet database, utilizing the implicit inductive bias contained in those features for disentanglement. This bias can be further enhanced by explicitly fine-tuning the feature maps on auxiliary tasks useful for the challenge, such as angle and position estimation or color classification. Our approach achieved 2nd place in stage 2 of the challenge (AIcrowd, 2019). Code is available at https://github.com/
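The regional aggregation of pretrained feature maps can be sketched as follows. This is a simplified numpy illustration under our own assumptions: the submission's aggregation is learned, whereas this shows the plain average-pooling variant of the same idea:

```python
import numpy as np

def regional_aggregate(feature_map, grid=4):
    """Average-pool a (C, H, W) pretrained feature map over a
    grid x grid spatial partition and flatten the result, giving a
    compact (grid*grid*C,) vector that a VAE could be trained on
    instead of raw pixels."""
    c, h, w = feature_map.shape
    pooled = np.empty((grid, grid, c))
    for i in range(grid):
        for j in range(grid):
            region = feature_map[:, i * h // grid:(i + 1) * h // grid,
                                    j * w // grid:(j + 1) * w // grid]
            pooled[i, j] = region.mean(axis=(1, 2))
    return pooled.reshape(-1)

vec = regional_aggregate(np.ones((256, 16, 16)), grid=4)
print(vec.shape)  # (4096,) = 4 * 4 * 256
```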
Machine Learning Model Best Practices
The number of shiny models out there can be overwhelming, which means a lot of times people fall back on a few they trust the most, and use them on all new problems. This can lead to sub-optimal results. Today we're going to learn how to quickly and efficiently narrow down the space of available models to find those that are most likely to perform best on your problem type. We'll also see how we can keep track of our models' performances using Weights and Biases and compare them. Unlike Lord of the Rings, in machine learning there is no one ring (model) to rule them all.
Choosing a Machine Learning Model - KDnuggets
The number of shiny models out there can be overwhelming, which means a lot of times people fall back on a few they trust the most and use them on all new problems. This can lead to sub-optimal results. Today we're going to learn how to quickly and efficiently narrow down the space of available models to find those that are most likely to perform best on your problem type. We'll also see how we can keep track of our models' performances using Weights and Biases and compare them. You can find the accompanying code here.
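The "narrow down the model space" workflow these posts describe boils down to scoring a shortlist of candidate models under the same cross-validation split and keeping the front-runners. A minimal scikit-learn sketch (the shortlist of models is illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for "your problem type"
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}

# Using the same 5-fold split for every candidate keeps scores comparable
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}

for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {s:.3f}")
```

From here, only the top one or two candidates need expensive hyperparameter tuning; the experiment-tracking step the posts mention would log each candidate's scores instead of printing them.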